The Decades of My Life

The  development  of  data  archives  and  of 
local  data  libraries  and  the  growth  of 
IASSIST and of associated organizations
are  a  function  of  the  growth  of  the 
quantitative  social  and  behavioral 
sciences.  This  growth  in  turn  was  made 
possible  by  the  concomitant  development 
of  computers  and  of  statistics. 


by Judith Rowe


I  have  attempted  to  place  these  events  in  a  larger  social  and 
political  context.  In  order  to  do  this  I  have  taken  advantage 
of  the  number  of  magazine  and  newspaper  articles  and  the 
number  of  web  pages  which  are  currently  reviewing  the 
twentieth  century. 

I  have  thoroughly  enjoyed  looking  back  over  this  history, 
and  I  only  regret  that  I  have  had  to  omit  so  many  names 
and  so  many  events.  I  make  no  apologies  for  the  fact  that 
my  emphasis  is  American;  my  memories  are  largely 
American.  Nor  do  I  claim  that  the  names  and  events  I've 
included  are  the  most  significant.  Others  might  present  a 
very  different  history. 

I begin in the 1930's because that's when I was born, the
decade in which I started elementary school where I fought
for the right of girls to wear slacks to school... the beginning
of my career as an advocate.

The  1930's 

The 1930's brought passenger airlines, LIFE magazine,
Monopoly,  Mickey  Mouse  and  Snow  White,  the  "Golden 
Age"  of  radio,  drive-in  movies  and  such  classic  films  as 
"Gone  With  The  Wind,"  the  Empire  State  Building  and  the 
Golden  Gate  Bridge. 

The  Great  Depression  brought  alphabet  soup  to 
Washington,  with  agencies  such  as  the  Social  Security 
Administration,  a  ready  market  for  data  processing 
equipment.  Japan  invaded  China,  Edward  VIII  abdicated  to 
marry  Wallis  Warfield  Simpson.  Hitler  rose  to  power  in 
Germany,  and  there  were  other  brands  of  fascism 
elsewhere. 


Scholars,  primarily  but  not  exclusively 
Jewish,  fled  to  the  U.S.  The  electron 
microscope  was  developed  at  the 
University  of  Toronto,  and  the  Dionne 
quintuplets were born. The Literary
Digest poll of 1936 predicted Landon
over Roosevelt, and the close of the
decade brought the New York World's Fair,
the first regularly scheduled TV, and war in
Europe.


Unit  record  equipment  based  on  Jacquard  weaving  cards 
had  been  developed  more  than  30  years  before  by  Herman 
Hollerith  to  analyze  the  1890  Census  in  the  United  States 
and  was  still  in  use.  This  equipment  included  numeric 
keypunches,  sorters  and  later  accounting  machines,  and  the 
famous 101 widely used to tabulate polling results.

In  1936  the  Englishman  Turing  defined  "the  Turing 
machine." 

Vannevar  Bush  developed  an  analog  computer,  and  the  first 
truly electronic computer was built in Iowa in the late 30's;
the ABC, as it was called, even used binary arithmetic.

The Lynds, Lloyd Warner and others were doing
community  studies  in  Middletown  and  Yankee  City. 

Morris Hansen began the development of large-scale
sampling,  but  the  Gallup  Polls  were  begun  by  George 
Gallup  using  "quota  samples." 

Attitude measurement matured under the Allports, Likert
and  Bogardus.  Hadley  Cantril  did  surveys  throughout  the 
world,  and  his  multi-punched  cards  were  recently  located, 
converted  to  tapes  and  are  now  available  at  ICPSR. 

Public Opinion Quarterly was started in 1937, and from
1940-51  POQ  carried  a  special  section  on  poll  results. 

New  economic  censuses  and  national  surveys  of 
unemployment  and  crop  production  were  instituted,  and  the 
Brookings  Institution  was  a  going  concern. 

The 1940's

The 1940's saw the U.S. allied with Europe on both the
Western and Asian fronts, Russia joining the western
alliance, and women at work in factories and in offices and
in such military units as the WACS and the WRENS. Many
continued  to  work  when  the  war  ended  in  1945  after  atomic 
bombs  had  been  dropped  on  two  Japanese  cities. 

All  over  the  world  soldiers  became  students,  the  oldest 
cadre  of  students  the  world  has  ever  known.  The  provision 
of  veterans'  benefits  required  enormous  record-keeping 
efforts.  Countries  in  Europe  and  Asia  had  seen  their  records 
of  government  destroyed,  and  the  post-war  period  provided 
an  opportunity  to  begin  anew.  Israel  is  established  as  a 
Jewish  state,  Mao  proclaims  the  establishment  of  the 
People's  Republic  of  China,  and  Newfoundland  becomes 
Canada's  tenth  province. 


The World Bank and the International Monetary Fund are
founded, and a year later in 1945 the UN with its many
agencies including UNESCO. W. Edwards Deming
commutes to Japan to organize their census and to teach the
principles of quality control.

Antibiotics, the Big Bands and the jitterbug, abstract
expressionism, and Orson Welles' "Citizen Kane."
Television features the World Series, the Amateur Hour
and the longest running program in history "Meet the
Press," and the unbreakable vinyl LP replaces the shellac
record.

MARK I, programmed by paper tape, was followed in 1943
by ENIAC, developed by Eckert and Mauchly at the
University of Pennsylvania. This was followed in turn by
EDVAC, EDSAC, ILLIAC, JOHNNIAC, MADM and others,
moving from digital to binary, using memory to store both
programs and data and adding serial processing units. The
transistor was invented in 1947, and magnetic core memory
in 1949.

In 1946 the Roper Center was created at Williams College
as a home for Gallup, Crossley and Roper Polls, some from
as early as 1936. The Center was run for decades by Philip
and Elizabeth Hastings.

Angus Campbell began work on attitude surveys and
opinion polling, and the forerunners of The American
National Election Surveys were completed in 1944 and
1948.

The American Association for Public Opinion Research
(AAPOR) was formed in 1947, and in 1948, immediately
after the election in which the pollsters chose Dewey over
Truman, a committee chaired by Fred Stephan and
S.S. Wilks was appointed to find out why.

Samuel Stouffer completed the monumental American
Soldier study, data for which are now available from
Roper, and in 1948 the Elmira Study. When Stouffer died,
Harvard sent the cards for the American Soldier to Roper.
It was not until the late 70's that the Department of Defense
provided funding to read the cards to tape, develop
codebooks, and send a copy to National Archives.

Guttman commenced his work on scaling theory, and
Deutsch, Russett and Merritt on quantitative models of
nationalism and integration. Stouffer, Lazarsfeld and
Anderson developed multivariate analytic techniques based
on the work of Pearson, Yule and Fisher.

The Rand Corporation, the Urban Institute and NORC were
all established in the 1940's, and at the close of the decade
the Social Science Research Council (SSRC) supported a
Conference on Political Behavior chaired by Pendleton
Herring.

I got married in the 1950's while a graduate student at Yale.
I then spent several years learning about marketing research
while my husband served in the U.S. Navy, and before the
decade was over I was the mother of two sons.


The  1950's 

The 1950's saw women back in the home, mid-calf skirts
replacing  mid-knee  skirts,  the  beginning  of  the  baby  boom, 
men  in  gray  flannel  suits,  large  new  housing  developments 
on  former  potato  fields,  Russell  Wright  dishes,  Danish 
modern furniture, and in the United States a million and a
half  TV's  playing  "I  Love  Lucy"  and  the  Ed  Sullivan  Show. 

Later  in  the  decade  Xerox  manufactures  a  plain  paper 
copier,  the  seeds  of  the  civil  rights  and  women's 
movements  are  planted,  the  Korean  War  continues  into  the 
Eisenhower  years,  and  McCarthy  runs  riot  through 
Hollywood,  the  universities  and  on  TV.  The  polarization  of 
East and West results in the "cold war."

Germany  and  Japan  industrialize,  and  the  centralizing  of 
governments  requires  more  data  for  every  purpose.  Egypt, 
India  and  Ireland  become  independent  republics,  Stalin 
dies,  the  Warsaw  Pact  is  signed,  the  USSR  launches 
Sputnik,  and  the  Hungarian  Revolution  is  suppressed  by 
Soviet troops in the same year in which I had my first child.
Fidel  Castro  becomes  ruler  of  Cuba,  and  Mexican  women 
get  the  right  to  vote. 

In 1951 UNIVAC 1, the first alphanumeric computer, is
produced,  and  Walter  Cronkite  uses  UNIVAC  2  to  predict 
the  1952  election.  Unable  to  believe  the  computer  report  of 
such  a  complete  Eisenhower  sweep,  he  fails  to  report  it. 

In  1953  IBM  announced  their  first  real  computer,  the  701. 
This was followed in turn by numerous descendants
including  the  704,  designed  by  Gene  Amdahl,  and  in  1958 
the  709,  whose  competitors  included  the  CDC  1604. 
Computers  were  also  being  built  in  England,  in  Germany 
and  in  Japan. 

IBM sold 450 of their first mass-produced computer, the
650,  in  1954.  This  was  very  popular  on  college  campuses 
for  a  number  of  years. 

In  the  mid-fifties  Bell  Labs  announced  the  first  fully 
transistorized  computer,  MIT  began  work  on  direct 
keyboard input, today's normal mode of operation, and
SAGE (Semi-Automatic Ground Environment) linked
hundreds of radar stations in the US and Canada in the first
large-scale computer network.

The COBOL compiler was developed by Grace Hopper in
1952, and FORTRAN by John Backus in 1957. These were
followed  in  turn  by  ALGOL,  LISP  and  APL. 

Tape  drives  were  developed,  and  tapes  written  at  200  bpi 
could  store  the  contents  of  70,000  cards. 

York  Lucci  and  Stein  Rokkan  wrote  their  seminal  paper  on 
the role of the traditional library in providing access to
data. 

The Human Relations Area Files developed at Yale to
collect  data  from  anthropologists,  and  the  International 
Data  Library  opened  its  doors  at  Berkeley  to  collect  Third 
World  survey  data. 

The  Institute  of  Social  Research  flourished  at  Michigan  and 
the  Bureau  of  Applied  Social  Research  at  Columbia. 
Survey research and sampling were here to stay... or so we
thought. 

Almond  and  Verba  completed  the  Civic  Culture;  Dahl  the 
New  Haven  Study  and  numerous  Health  Surveys;  and  the 
Wisconsin  Longitudinal  Study  and  the  American  National 
Election  Survey  began  their  long  histories. 


Campbell  described  "The  Archival  Resources  of  SRC"  in 
POQ  and  later  in  the  decade  POQ  ran  a  special  issue  on 
archives  in  which  Converse  described  "A  Network  of  Data 
Archives  for  the  Social  Sciences,"  and  Philip  Hastings 
described  the  growing  holdings  of  the  Roper  Center;  3,200 
surveys  from  70  countries. 

My  child-bearing  years  ended  in  the  sixties  with  the  birth  of 
my  daughter  and  by  the  end  of  the  decade  I  was  employed 
part-time  at  Princeton's  Office  of  Survey  Research  and 
Statistical  Studies.  I  soon  attended  my  first  ICPR  meeting 
and  was  designated  Princeton's  OR. 

The  1960's 

The  1960's  were  the  years  of  the  Beatles  and  the  flower 
children,  of  birth  control  pills,  zip  codes  and  of  John  F. 
Kennedy and Camelot, of the continued expansion of the
Vietnam  War,  of  space,  of  the  Chinese  Cultural  Revolution, 
of  Martin  Luther  King,  Lyndon  Johnson's  "Great  Society" 
and  the  building  of  the  Berlin  Wall  and  of  Pearson  and 
Trudeau  as  Canadian  Prime  Ministers.  In  1963  women  are 
allowed  to  vote  in  Iran,  Kennedy  is  shot  in  Dallas,  TX, 
Canada adopts the Maple Leaf as its flag, and Indira Gandhi
becomes  India's  Prime  Minister. 

By  the  close  of  the  decade  there  are  200  million  TV's 
world-wide  with  78  million  in  the  U.S.  The  first  successful 
human heart transplant is performed and American Airlines
launches its SABRE system for airline reservations.

Second-generation  computers  such  as  the  IBM  7090  and 
the  CDC  3600  opened  the  decade.  In  1963  the  PDP-8  was 
a runaway success, and IBM sold its CADET (Can't add,
doesn't even try), later designated the 1620.

The  price-tag  on  computers  was  in  the  multi-millions  with 
the  giant  STRETCH  and  its  competitor  the  CRAY  costing 
in  the  vicinity  of  $8  million  although  Data  General's  Nova 
with  32  kilobytes  of  memory  had  become  available  for 
$8,000. 

By  the  middle  of  the  decade  the  IBM  360  was  produced, 
and  the  disk  had  replaced  the  drum.  Time-sharing  had 
arrived,  and  800  bpi  tapes  were  just  beginning  to  replace 
556  bpi. 

John  Kemeny  wrote  BASIC,  and  UNIX  emerged  from 
MIT's  Project  MAC  and  was  further  developed  at  Bell 
Labs. 

This  was  the  era  of  batch  processing,  of  punched  cards,  and 
of matrix printing on green-bar paper, but also the time in
which  ASCII  was  developed  making  it  possible  for 
machines  from  different  manufacturers  to  exchange  data. 

All of the major statistical packages, as well as many long
gone, saw the light of day. David Armor wrote DATA-TEXT
at Harvard in assembly language for the 7094.
Norman  Nie  produced  SPSS  at  Stanford,  and  Roald  Buhler 
wrote  P-STAT  for  the  psychometricians  and  experimental 
psychologists  at  ETS  and  Princeton.  BIOMED  was 
developed  at  UCLA.  Ken  Janda  wrote  NUCROS  at 
Northwestern  and  Ed  Myers  wrote  a  time-sharing  package, 
IMPRESS,  at  Dartmouth.  SUPPAK  was  produced  at 
Illinois,  and  the  now  ubiquitous  SAS  was  developed  by  the 
agricultural  statisticians  in  North  Carolina.  Nonetheless 
most  social  scientists  were  still  using  the  card  sorter  and  the 
Friden  or  Monroe  calculators.  Simple  locally  written 
software  packages  seldom  went  beyond  cross-tabulations 
and  chi-square. 

ICPR  was  established  by  Warren  Miller  in  1962  as  a 
consortium of eight institutions; the Zentralarchiv was
established in Cologne by Erwin Scheuch; and archives
were established at Essex, Amsterdam and Bergen some
years  later. 

Jerry  Clubb  arrived  at  ICPR  in  1965  to  begin  the 
conversion  to  machine-readable  form  of  quantitative 
historical  data  including  census,  election  and  roll-call  data 
going  back  to  1790. 

In that same year the Louis Harris Data Center, the first
state-supported data archive, was established.

Under  the  auspices  of  UNESCO,  the  International  Social 
Science  Council  and  the  National  Science  Foundation 
(NSF)  three  conferences  on  data  archives  were  held 
between  1963  and  1965.  They  addressed  archiving 
aggregate  national  statistics,  comparing  nations,  and  the 
organization  of  data  banks  and  archives. 

The  Council  of  Social  Science  Data  Archives  was  funded 
by  NSF  in  1967,  and  archive  directors  and  some  senior 
staff  from  Michigan,  UCLA,  Columbia,  Berkeley,  Yale, 
Wisconsin  and  the  Roper  Center,  joined  their  European 
counterparts  in  meetings  at  UCLA  in  1967,  at  a  workshop 
at  UNC  in  1968,  and  later  that  year  in  Pittsburgh  (my  first 
professional  trip  and  my  first  flight  on  a  jet  plane)  and 
finally  in  Wisconsin  in  1968. 

Local data services were in place at Princeton,
Northwestern, at the Universities of British Columbia and
North Carolina as well as at Wisconsin and Yale. At
Princeton the library was already paying the ICPR
membership.

It  was  the  beginning  of  the  Current  Population  Surveys,  the 
Hospital  Discharge  Surveys,  the  Panel  Study  of  Income 
Dynamics,  the  National  Fertility  Surveys  (later  to  become 
the  Surveys  of  Family  Growth),  National  Longitudinal 
Studies  of  Labor  Force  Participation  (then  widely  known  as 
the "Parnes data"), national election surveys in Canada and
western  Europe,  the  World  Handbook  of  Political  and 
Social  Indicators,  and  the  heyday  of  cross-national 
research. 

A 1/1000 and a 1/10,000 Public Use Sample from the U.S.
1960 decennial census were released to a few selected
researchers on punched cards and later on tape. They contained
both  household  and  person  records  but  no  code  to  link  one 
to  the  other. 

IBM supported six regional conferences in the humanities
which culminated in 1967 in a publication edited by
Edward Bowles, and Joe Raben began Computers and the
Humanities (CHUM).

In the 1970's I became actively involved in the burgeoning
data movement, traveled to Europe at least once each year
for a meeting related to social science data and information,
and developed a census data service. Social Science User
Services and the Princeton-Rutgers Census Data Project
were ensconced at the computer center.

The  1970's 

The  1970's  saw  the  first  of  the  "baby  boomers"  reach 
maturity, Vietnam protestors attacking University computer
centers  and  finally  the  end  of  the  Vietnam  War;  Nixon, 
Watergate  and  the  Pentagon  Papers  were  followed  by  new 
Freedom  of  Information  and  Privacy  legislation,  by  the  first 
non-Italian  Pope  since  1522  and  by  the  death  of  Elvis 
Presley.  "Our  Bodies  Ourselves"  was  a  best-seller. 
Environmental  concern  groups  became  more  active,  and 
crack  cocaine  made  its  first  documented  appearance.  South 
Africa  is  expelled  from  the  United  Nations.  In  an  era  of 
prosperity,  conservative  governments  were  elected 
everywhere,  including  Margaret  Thatcher  in  the  United 
Kingdom, and right-wing religious groups were active in
many  countries. 

Early in the decade Intel built the microprocessor, the
8-inch floppy diskette was invented, and by 1978 the 5 1/4
inch floppy was on the market.

The Wang word processing machine was shortly followed
by the release of the Atari, the Tandem, the APPLE I, and
in 1977 the Radio Shack Tandy, the Commodore PET and
the APPLE II, the last three of which were instant market
successes. In the first month of sales 10,000 Tandys were
sold. 

With  the  founding  of  Microsoft  and  Apple,  Steve  Jobs  and 
Bill  Gates  were  the  heroes  of  the  microcomputer  industry. 

dBASE,  VISICALC  and  WORD  STAR  were  the  best 
selling  software  products. 

On  the  other  end  of  the  spectrum  the  US  Department  of 
Defense  established  four  nodes  on  the  ARPANET,  and  by 
the  end  of  the  decade  there  was  widespread  use  of  online 
and  timesharing  systems.  The  370  which  supported  many 
of  these  systems  had  10  million  operating  instructions  as 
compared  to  the  650's  5,000. 

Online  bibliographic  services  like  Dialog,  BRS  and  ORBIT 
came  into  their  own. 

The  U.S.  Census  released  off-the-shelf  data  products,  both 
aggregate  and  sample  data,  and  there  was  a  growing 
involvement  of  traditional  libraries  in  providing  data 
services. 

The  American  Library  Association  constituted  a 
subcommittee to recommend rules for cataloging
machine-readable data files, and AACR2 added Chapter 9 with those
recommendations.  By  1972  several  academic  libraries 
began  to  catalog  census  data  in  a  form  other  than  print. 

ICPR  added  an  'S'  to  become  a  general  purpose  social 
science  data  archive.  It  started  the  decade  distributing  more 
than  28  million  card  images  and  ended  it  distributing  over 
438  million  card  images.  By  the  end  of  the  next  decade 
that  number  had  reached  4  billion. 

Programs  on  data  services  were  presented  at  ALA,  SLA, 
APLIC, ASIS and WAPOR conferences, and
ACM/SIGSOC was an active force in the development of
statistical  computing. 

NBER  organized  a  conference  in  New  York  on  data  issues, 
and  ICPSR  cooperated  with  the  Bentley  Library  on  a 
conference  on  archival  management  of  machine-readable 
records. 

IASSIST was organized at a meeting in Toronto sponsored
by the World Congress of Sociology and hosted by Mike
Aiken. Carolyn Geda was the first president.

A  rash  of  other  new  organizations  included  IFDO,  APDU, 
QUANTUM,  GODORT,  the  Social  Science  History 
Association,  the  European  Political  Science  Consortium, 
the  Canada  Data  Clearinghouse  and  the  Association  for 
Computing  in  the  Humanities. 

IASSIST  met  in  London,  Edinburgh,  Cocoa  Beach, 
Toronto, Itasca, Uppsala and Ottawa.

The  first  International  Conference  for  Databases  in  the 
Humanities  and  Social  Sciences  (ICDBHSS)  was  held  at 
Dartmouth  in  conjunction  with  ACH. 

The  Danish  Data  Archive  (Dansk)  was  established  in  1973, 
and  the  European  archives  sponsored  meetings  on  the 
Study  Description. 

Introductory  training  for  new  data  librarians  became  a 
mainstay of IASSIST conferences, a data library workshop
was offered at Wisconsin, and a course on
machine-readable data was offered by Sue Dodd at the UNC Library
School.  The  first  regular  data  library  workshop  was  held  at 
ICPSR,  a  program  maintained  to  this  day,  and  the  U.S. 
Census  began  offering  seminars  for  librarians. 

Public  use  microdata  samples  from  censuses  were  released 
on  tape  by  the  U.S.,  Canada  and  Papua  New  Guinea. 

A  growing  number  of  federal  agencies  including  the 
National  Center  for  Health  Statistics,  the  Bureau  of  Labor 
Statistics,  the  National  Center  for  Education  Statistics,  and 
STATSCAN  among  others  began  releasing  a  wider  range 
of  non-census  public  data  products. 

1978  was  a  landmark  year.  NSF  funded  the  National 
Conference  on  Cataloging  and  Information  Services  for 
Machine-Readable  Data  Files  at  Airlie  House  in  Virginia. 
The  recommendations  of  that  conference  led  to  the 
development  of  a  MARC  format  for  these  materials. 
Patrick  Bova  of  National  Opinion  Research  Center 
provided  catalog  facsimile  and  a  bibliographic  citation  on 
the  verso  of  the  title  page  of  the  codebook  for  the  General 
Social  Survey  which  had  been  initiated  in  1972.  This 
provided  a  model  for  ICPSR  and  others.  In  that  same  year 
the  Office  of  Statistical  Policy  and  Standards  of  OMB,  now 
OIRA,  established  a  federal  task  force  to  develop 
descriptive  standards  for  computer  files. 

By the 1980's I was working full-time plus and served as
president of IASSIST, APDU, and COPAFS as well as a
member of the ICPSR Council.

The  1980's 

The 1980's saw Reagan replace Carter, increased inflation,
mounting public debt, government deregulation, AIDS, the
murder  of  John  Lennon,  the  Contra  scandal  and  a  decline  in 
the  value  of  the  dollar.  An  aura  of  the  1920's  gave  us 
"yuppies"  instead  of  "flappers"  and  the  return  of  both 
condoms  and  shoulder  pads.  In  the  USSR  we  saw 
Gorbachev, glasnost and perestroika. The Iran-Iraq war
begins  and  Saddam  Hussein  becomes  the  President  of  Iraq. 
OPEC  agrees  to  cut  oil  prices,  cellular  phones  are 
introduced,  and  the  Space  Shuttle  Challenger  explodes  in 
the  air.  Britain  gives  Canada  the  right  to  amend  its 
constitution.  Governments  are  overthrown  in  Asia  and 
eastern Europe, and the Berlin Wall comes tumbling down.

Supercomputers and NSFNET changed the face of
large-scale computing, and MACS, PC's and clones that of
small-scale computing.

BITNET  and  then  INTERNET  provided  electronic  mail, 
listservers  and  remote  logins  to  academic  users  throughout 
the  world. 

A  new  storage  technology,  the  tape  cartridge,  appears  on 
the market. It holds the equivalent of 8 million cards or four
times  that  of  a  6250  tape.  Five  megabyte  hard  drives 
became  available  for  microcomputers. 

IBM  finally  releases  a  microcomputer,  and  colleges  and 
universities  begin  to  take  this  technology  seriously. 

The CD/ROM provides online services with serious
competition.  Cuadra  begins  issuing  a  Directory  of 
Databases  at  the  beginning  of  the  decade.  By  the  end  of  the 
decade  400  databases  have  become  4465,  and  the  new 
Directory  of  Portable  Databases  contains  409  CD/ROM 
products. 

Osborne  markets  the  first  portable  computer,  a  24  pound 
wonder for $1,795 and Apollo markets the first UNIX
workstation.  At  the  middle  of  the  decade  Apple  launches 
the  Macintosh,  the  first  mouse-driven  computer  with  a 
graphic  user  interface  and  a  3  1/2"  floppy,  and  IBM 
markets  its  PC-AT  based  on  the  80286  Intel  chip.  The 
going price for each of these is about $4,000.


In 1983 TIME names the computer the "Man of the Year,"
and by the end of the decade UNIX workstations with
high-resolution graphics are the mainstay of scientific and
engineering computing and are already replacing large
mainframes as servers for social science data.

Word processing, database management systems and
spreadsheet programs are the most widely-used
microcomputer products. Listserv software is developed,
and for programmers C++ emerges as the dominant
object-oriented language.

Relational, multi-platform database systems like ORACLE,
INGRES and INFORMIX are developed.

Traditional statistical packages add data management
capabilities and release new versions for UNIX-based
machines and microcomputers.

Data services in traditional libraries begin to come into
their own. The American Library Association publishes
Sue Dodd's "Cataloging Machine Readable Data Files: An
Interpretive Manual." A revised Chapter 9 renames MRDF
"computer files," the US Joint Committee on Printing
explores providing computer materials as part of the
depository library program, and the Research Libraries
Group and the Association of Research Libraries begin to
address these issues.

The University of Michigan Library sends catalog records
for all of ICPSR's holdings to RLIN and regular updates
follow. A special issue of Library Trends addresses data
issues as do articles in every library publication.

Population Index becomes the first bibliographic journal to
cite computer files, SOCIAL FORCES the first major
social science journal to provide guidelines for citing
MRDF in their author guidelines, and the Encyclopedia of
Population carries an article on MRDF.

Australia, Sweden and Hungary establish data archives, and
the first Data Librarians serve on the ICPSR Council.

IASSIST meets in Washington, Grenoble, Coronado
Beach, Philadelphia, Ottawa, Amsterdam, Santa Monica,
Vancouver, goes back to DC, and then to Jerusalem.
Robbin, Gavrel and Rowe are succeeded by Brown as
president.

More and larger census samples are released by the US,
Canada, Norway, Australia and Israel. Sweden provides an
online product using their basic record files.


New  titles  including  SIPP  and  SIPP-like  studies  and  the 
Luxembourg Income Studies appear. Additional countries
participate  in  the  International  Social  Survey  Program,  and 
the  USSR  participates  in  cooperative  survey  efforts. 

ICPSR celebrates its 25th anniversary, and the Blalock
report  recommends  the  major  restructuring  within  ISR 
which  has  finally  taken  place. 

In  this  current  decade  a  newly  rechristened  data  service  at 
Princeton  has  a  second  brush  with  death,  but  thanks  to  an 
outcry  of  both  internal  and  external  support  finally  moves 
from  CIT  to  Firestone  Library's  Social  Science  Reference 
Center  where  business  booms,  especially  in  providing 
financial  and  other  economic  data.  I  prepare  for  retirement 
and  for  more  time  with  my  family  which  now  includes  five 
grandsons. 

The  1990's 

Nelson  Mandela  is  freed  after  27  years  in  prison;  the 
Hubble  Space  Telescope  is  launched;  Bush  and  Gorbachev 
agree  to  cut  nuclear  arms  and  chemical  weapons;  and 
Yeltsin  is  elected  president  of  Russia.  Iraq  invades  Kuwait, 
and  the  Gulf  War  begins.  The  world  population  exceeds 
5.2  billion.  Hundreds  of  thousands  of  Rwandans  die  in  a 
brutal  civil  war,  and  war  continues  through  the  decade 
among  the  republics  of  Yugoslavia.  Kim  Campbell 
becomes  Canada's  first  female  Prime  Minister,  and  both 
Mother Teresa and Princess Diana die in 1997. New
leaders  take  over  in  Indonesia  and  Nigeria.  Dental  and 
corneal  implants,  and  titanium  knees  become 
commonplace. 

By 1991 three out of four U.S. homes own VCR's, the
fastest  selling  domestic  appliance  in  history.  By  1994  the 
U.S.  government  privatizes  Internet  management,  and  in 
1995  Sony  demonstrates  the  flat  TV,  Denmark  announces 
plans  to  put  much  of  the  nation  online  by  the  end  of  the 
decade,  and  major  newspapers  throughout  the  world 
become  web-accessible. 

The  beautiful  NEXT  developed  by  Steve  Jobs  has  neither 
software  nor  customers.  By  the  end  of  the  decade  Apple 
makes  a  comeback  with  a  "decorator"  machine. 

Electronic  communication  becomes  almost  commonplace 
throughout  the  world.  More  and  more  text  and  numeric  data 
become  available  online  as  file  servers,  and  networks  and 
remote  logons  become  widely  available. 

Gopher, developed at the University of Minnesota in 1991,
is replaced by the World Wide Web, developed in
Switzerland. In 1992 there are 20 web servers; in 1993,
200; in 1996, 100,000; and in 1998, 3.8 million. Everyone
has  a  web  page  and  some  concern  has  developed  about 
archiving  things  of  value  and  about  sorting  the  wheat  from 
the  chaff. 


Netscape  replaces  Mosaic,  and  search  tools  of  numerous 
varieties  become  available  until  in  1999  1,000  discrete 
engines  have  been  identified.  As  an  increasing  number  of 
data points become available for time series analysis,
economists  become  major  users  of  microdata  as  well  as  of 
financial  data. 

The  periodicals  component  of  the  acquisitions  budgets  at 
most  University  Libraries  has  increased  from  roughly  50% 
to over 75%, leaving less money for monographs. Sales of
books to these libraries by University Presses have dropped
by  half,  and  the  Presses  are  publishing  more  books  but 
fewer  scholarly  ones. 

IASSIST meets in Poughkeepsie, Alberta, Madison,
Edinburgh, San Francisco, Quebec, Minneapolis, Odense
and  New  Haven  and  this  week  celebrates  its  25th 
anniversary  in  Toronto. 

Stephenson, Humphrey and Burnhill serve as IASSIST
presidents. 

What  does  the  next  decade  hold?  We  can  only  guess. 

We  would  anticipate  more  resistance  to  decennial  censuses 
but  more  public  use  microdata  from  those  censuses  which 
are  completed. 

An increase in local service data libraries in service
environments  with  primary  data  becoming  a  routine  part  of 
library  collections  and  data  analysis  a  routine  part  of 
education  at  all  levels. 

Standard  cataloging  and  citation  will  become  routine. 

Image  cataloging  and  image  databases  will  make 
collections  of  pictures,  slides,  artifacts,  etc.  increasingly 
available  to  students  and  scholars. 

Data  files  will  become  even  larger  and  even  more  complex. 

Large  memory  UNIX  workstations  with  high  resolution 
graphic monitors will replace PCs and Macs, and
network  connections  will  become  faster,  more  reliable,  and 
probably  more  expensive. 

And IASSIST will grow and prosper, and we will all live
happily  ever  after  ...  friends  and  colleagues  to  the  end. 

' Presented at the 25th Anniversary IASSIST Conference,
May  17,  1999,  at  Ryerson  Polytechnic  University,  Toronto, 
Ontario. 
Democratizing Access to Data:
The American Religion Data Archive
(www.TheARDA.com)


by Roger Finke, Jennifer McKinney and Matt Bahr*

From its beginning, the American Religion Data Archive (ARDA) was
developed to provide immediate access to the best data on American
religion at no charge. Starting in 1997, the ARDA was created as an
Internet-based archive and was designed to serve a highly diverse
audience. But serving a diverse audience, including many with little or
no background in the social sciences, required ARDA to meet the rigorous
methodological standards of the social science community and still be
easily used by those without a knowledge of statistics, research design
or data management. Since its inception, the ARDA has attempted to
democratize access to data, without compromising the integrity of the
data being archived.

This  essay  will  review  our  efforts.  We  begin  by  giving  a 
brief  overview  of  the  data  we  archive  and  the  audience  we 
serve.  Next,  we  will  review  the  goals  of  the  ARDA  and 
how  we  attempt  to  achieve  each  goal.  Although  the  goals 
are  similar  to  many  other  archives,  we  will  highlight  how 
we  have  developed  features  that  allow  us  to  achieve  these 
goals  in  new  and  creative  ways. 

Religion  Data  Sources  and  Users 

When  ARDA  was  initially  conceived,  the  1995-96  ICPSR 
Guide to Resources and Services reported on more than
40,000  data  files  from  over  3,000  social  research  studies. 
Even  a  topic  such  as  education,  which  had  comparatively 
few entries, reported 119 data files from 65 studies, with 34
of  these  studies  being  conducted  since  1980.  By 
comparison,  the  subheading  of  religion  reported  only  9  data 
files  from  9  studies,  with  only  two  of  the  studies  being 
conducted  after  1980.  Yet,  this  paucity  of  archived  data  on 
religion does not mean that data are not being collected.
Over the last 10 years alone, Lilly Endowment has funded
over  150  grants  with  a  data  collection  component,  the  Pew 
Charitable  Trusts  has  funded  several  major  national  and 
international  surveys,  and  many  denominations  support 
research  divisions  that  collect  large  amounts  of  data  each 
year. Unlike education, health care and other substantive
areas,  where  most  studies  are  funded  by  government 
sources,  nearly  all  of  the  data  collections  on  religion  are 
funded  by  private  endowments  or  religious  organizations. 
Most funding sources have either wanted the data to remain "in house"
or have not required principal investigators to place the data files in
a public archive.

In the mid-1990s, however, the Lilly Endowment began a major initiative
for improving dissemination. One component of this initiative was the
American Religion Data Archive. After awarding a planning grant to
Roger Finke in 1996 to study the feasibility of starting a religion
archive, Lilly Endowment funded the start up and operation of the ARDA
from 1997 to 2000. Recently, they extended the support until 2003.
Thus,  the  funding  sources  for  the  collection  and  archiving 
of  data  on  American  religion  remain  private  sources. 

The ARDA currently holds 120 data files and the number
should  approach  150  by  the  close  of  1999.  These  studies 
include  national  samples  of  the  United  States  and  Canada, 
regional  samples  of  selected  communities  or  areas,  and 
samples of selected religious groups or professionals.
Although all surveys include the topic of religion, the
survey items span a wide range of other topics (e.g., from
involvement in small groups and politics to attitudes on
race  relations  and  professional  development).  In  addition  to 
the  surveys,  the  ARDA  also  distributes  data  on  American 
religion  by  ecological  units,  such  as  the  Association  of 
Statisticians of American Religious Bodies' data on
churches  and  church  membership  by  counties  and  states  for 
1980  and  1990.  Both  the  size  and  the  diversity  of  the 
collection  will  continue  to  grow. 

Once  established,  the  greatest  challenge  for  the  ARDA  was 
appealing  to  the  diverse  audience  interested  in  American 
religion. Initially, we were most aware of the social
scientists  from  research  universities  who  frequently 
conduct  and  report  on  the  major  data  collections.  For  this 
group,  the  ARDA  was  a  valued  repository  of  past  data 
collections  and  a  source  of  new  data  for  future  research 
studies.    But  this  audience,  often  sophisticated  in  research 
methods  and  statistics,  represents  only  a  small  portion  of 
the  total  audience.  Many,  and  probably  most,  of  our  users 
have  little  background  in  the  social  sciences  and  are  not 
located  at  research  universities.  Instead,  many  are  faculty 
members  and  students  located  at  small  universities, 
colleges, and seminaries that previously had little access to
data on American religion. Based on our web site reports,
seminaries  have  made  more  contacts  and  referrals  to  the 
ARDA  site  than  any  other  type  of  educational  institution. 
And,  though  we  have  no  record  of  individual  users,  our 
most  frequent  e-mail  and  telephone  inquiries  come  from 
journalists  and  students.  Several  instructors  have  informed 
us  that  they  have  incorporated  ARDA  data  files  and 
software  into  class  assignments.  Rather  than  limiting 
access  to  a  small  group  of  researchers,  ARDA  has 
democratized  access  to  the  data,  and  a  very  disparate 
audience  is  taking  advantage  of  this  access.    Below  we 
review  how  we  appeal  to  this  disparate  audience  as  we 
strive  to  achieve  standard  archiving  goals. 

Goals  of  the  American  Religion  Data  Archive 

The  goals  of  the  ARDA  are  similar  to  those  of  many  other 
archives.  ARDA  was  established  to: 

1. Preserve Data

2.  Improve  Access  to  Data 

3.  Increase  the  Use  of  Data 

4.  Allow  Comparison  Across  Data  Files 

To  achieve  these  goals  we  combine  proven  archiving 
practices  with  new  attempts  to  serve  a  diverse  audience. 

The  first  goal,  preserving  data,  is  the  foundation  of 
virtually  all  archives,  and  in  the  case  of  data  on  American 
religion,  it  was  the  most  essential.  Of  the  first  150  data 
files  we  received  for  the  ARDA  only  three  were  previously 
held in a public archive. Preparing the data for the archive
follows  many  of  the  same  procedures  developed  by  other 
scholarly  archives.  After  we  receive  the  data  files,  we 
verify  the  accuracy  of  the  data  by  comparing  our  variable 
frequencies  with  those  of  the  principal  investigator  and  we 
begin  collecting  summary  information,  or  metadata.  For 
each  of  the  files  we  offer  a  brief  abstract  of  the  study  and 
we  provide  information  on  the  number  of  cases,  number  of 
variables,  the  year  it  was  conducted,  sampling  techniques, 
sources  of  funding,  principal  investigators,  collection 
procedures,  any  related  publications  and  additional 
information  on  the  construction  of  indices  or  the  use  of 
weight  variables  when  appropriate. 
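
The frequency check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the variable, its response codes and the `compare_frequencies` helper are all hypothetical, and the paper does not say how (or whether) the ARDA staff automated this step.

```python
from collections import Counter


def compare_frequencies(archive_values, reported_counts):
    """Return {category: (archive_count, reported_count)} for every mismatch.

    archive_values  -- raw codes for one variable as read from the archived file
    reported_counts -- frequencies supplied by the principal investigator
    """
    archive_counts = Counter(archive_values)
    mismatches = {}
    # Check every category seen on either side, so missing codes are caught too.
    for category in set(archive_counts) | set(reported_counts):
        ours = archive_counts.get(category, 0)
        theirs = reported_counts.get(category, 0)
        if ours != theirs:
            mismatches[category] = (ours, theirs)
    return mismatches


# Hypothetical variable: frequency of attendance at religious services.
attend = ["weekly", "weekly", "monthly", "never", "never", "never"]
reported = {"weekly": 2, "monthly": 1, "never": 3}

print(compare_frequencies(attend, reported))  # empty dict: the file matches
```

Any non-empty result would point to a category whose count in the archived file differs from the investigator's own tabulation, which is the signal to go back to the depositor before release.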

In  our  effort  to  "democratize"  access  to  the  data,  however, 
we  have  gone  beyond  the  standard  procedures  used  to 
prepare  data  files  for  scholarly  research.  We  have  added  a 
couple of features that make the data files more accessible and
easier  to  use.  First,  we  recreate  the  original  survey 
instrument  within  the  data  set.    Using  the  original 
questionnaire,  we  record  the  complete  variable  description 
and  all  response  categories.  Users  are  not  forced  to  keep  a 
codebook  by  their  side  to  decipher  variable  names  or 
truncated  descriptions.  Moreover,  when  the  files  are 
downloaded  as  MicroCase  files  the  entire  survey  wording 
remains.¹ Second, we have designed the web site so users
are forced to review the metadata before they download
files, and they can easily link to the metadata whenever
they  are  reviewing  questions  or  data  from  the  file.  This  is 
handy  for  experienced  researchers  and  essential  for  those 
with  less  experience. 

Improving  access  to  data,  the  second  ARDA  goal,  was 
primarily  achieved  by  adding  an  easy  download  feature  to 
the  site.  Thanks  to  the  support  of  the  Lilly  Endowment, 
anyone  with  access  to  the  Internet  can  download  the  data 
free  of  charge.  Once  users  find  a  data  file  they  want  to  use, 
they  can  easily  download  it  to  their  own  PCs  as  an  SPSS, 
ASCII  or  MicroCase  file.  They  also  have  the  option  of 
downloading  a  codebook  without  the  data. 

Once  again  we  have  added  a  feature  that  allows  the  data  to 
be  used  by  non-specialists.  MicroCase  Corporation's 
statistical  software,  Explorit,  can  be  downloaded  free  of 
charge  and  is  fully  compatible  with  the  MicroCase  data 
files  available  from  our  site.  The  Explorit  software  is  used 
by  thousands  of  social  science  students  each  year 
throughout  the  United  States  and  Canada  and  is  remarkably 
easy  to  use.  The  Explorit  version  offered  from  the  ARDA 
site  holds  fewer  statistical  options  than  the  version  typically 
distributed  for  classroom  use,  but  it  offers  an  important 
option  for  non-specialists  who  do  not  have  a  statistical 
package readily available.² Many professors have found
this  to  be  an  especially  attractive  option  for  their  students. 

For  the  third  goal,  increasing  the  use  of  the  data,  we 
wanted  to  allow  users  to  conduct  basic  analyses  of  the  data 
files  on-line.  Yet,  from  our  own  classroom  experiences 
with  undergraduates,  we  knew  how  confusing  bivariate 
cross-tabular  analysis  can  be  for  those  not  familiar  with 
statistics.  First,  constructing  the  table  requires  students  (or 
any  user)  to  fill  in  boxes  that  ask  for  an  independent  and 
dependent  variable  —  unfamiliar  and  unfriendly  words  for 
most.  Second,  they  must  select  variables  with  an 
appropriate  number  of  categories.  For  example,  when  a 
student  tries  to  set  up  a  table  with  age  by  income,  the 
resulting  table  might  offer  an  incomprehensible  70  columns 
and  20  rows.  And,  even  if  they  are  successful  in 
constructing  an  appropriate  table,  they  need  to  know  which 
way  to  percentage  the  table.    Choosing  to  percentage  in  the 
wrong  direction  leads  to  meaningless  or  often  misleading 
results. 
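
The percentaging problem can be made concrete with a small Python sketch. The survey records, the variable names and the `percentaged_crosstab` helper are hypothetical and are not taken from the ARDA or MicroCase software; the sketch only shows what it means to percentage within categories of the independent variable.

```python
from collections import Counter


def percentaged_crosstab(records, independent, dependent):
    """Percentage a two-way table within each category of the independent variable."""
    cell_counts = Counter((r[independent], r[dependent]) for r in records)
    group_totals = Counter(r[independent] for r in records)
    table = {}
    for (group, answer), count in cell_counts.items():
        # Each group's percentages sum to 100, so groups can be compared directly.
        table.setdefault(group, {})[answer] = 100.0 * count / group_totals[group]
    return table


# Hypothetical survey: does weekly attendance differ by gender?
records = (
    [{"gender": "women", "attends": "weekly"}] * 6
    + [{"gender": "women", "attends": "less often"}] * 2
    + [{"gender": "men", "attends": "weekly"}] * 1
    + [{"gender": "men", "attends": "less often"}] * 3
)

by_gender = percentaged_crosstab(records, "gender", "attends")
print(by_gender["women"]["weekly"], by_gender["men"]["weekly"])  # 75.0 25.0
```

Swapping the two arguments percentages the table the other way and reports that about 86 percent of weekly attenders are women, a figure that partly reflects how many women happen to be in the sample rather than how attendance differs by gender.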

We  have  avoided  this  quagmire  by  working  with 
MicroCase  Corporation  to  develop  a  simplified  version  of 
their  auto-analyzer  for  our  web  site.  When  users  find  a 
question  of  interest,  they  can  click  on  a  button  called 
"Analyze"  and  tables  are  constructed  using  preset  variables. 
The  tables  are  percentaged  correctly  and  typically  include 
standard demographic variables such as age, gender,
income,  marital  status  and  education.  This  avoids  the 
potential  problems  of  choosing  an  independent  and 
dependent variable or deciding which way to percentage.³
For example, if a question is selected that asks "which
party's candidate would you be most likely to support if a
federal  election  were  held  tomorrow?,"  the  user  would  first 
see  a  table  summarizing  the  number  and  the  percentage  of 
respondents  who  would  vote  for  each  candidate.  Then  a 
series  of  tables  would  follow,  showing  how  these 
percentages  and  numbers  vary  by  age,  gender,  income  and 
so  forth.  The  user  has  received  a  series  of  meaningful 
tables  on  the  question  of  interest,  without  struggling 
through  a  series  of  commands. 

The  fourth  goal,  allowing  comparisons  across  data  files 
and  over  time,  is  achieved  through  standard  searches.  The 
user  can  search  for  a  topic  of  interest  within  a  single  data 
file,  a  selected  group  of  data  files,  or  all  ARDA  data  files. 
After  locating  questions  of  interest,  the  user  can  quickly 
compare  the  results  for  each  question  by  conducting  on-line 
analysis  or  they  can  compare  the  data  files  from  which  the 
questions  were  selected.  Thus,  users  can  quickly  compare 
similar  questions  to  see  if  they  offer  equivalent  results,  and 
they  can  review  information  about  the  data  files  to  better 
understand  why  the  results  might  differ  (e.g.,  the  samples 
might  vary  by  location,  time  or  religion). 

Once  users  receive  questions  from  their  searches,  they  can 
also  place  the  questions  in  their  own  question  bank.  In 
other  words,  they  can  start  saving  questions  for  their  own 
survey. During the planning phase of the ARDA, we were
encouraged  by  prospective  users  to  establish  an  archive  of 
questions  as  well  as  an  archive  of  data.  Because  the 
complete  survey  questions  are  entered  and  stored  in  the 
data  file,  however,  the  data  file  represents  a  complete 
record  of  the  survey  instrument.    Hence,  when  data 
collections  are  submitted,  ARDA  serves  as  an  archive  for 
the  questions  used  and  the  data  received.  By  combining 
the  question  bank  feature  with  the  search  feature,  the 
ARDA  becomes  a  rich  resource  for  constructing  a  new 
survey  as  well  as  using  a  previous  one. 

Summary 

We  recognize,  of  course,  that  ARDA's  initial  efforts  to 
democratize  access  to  data  are  simply  that:  initial  efforts. 
Still,  we  are  encouraged.    The  support  of  the  Lilly 
Endowment  has  made  the  archive  possible  and  has 
eliminated  the  barrier  of  financial  cost  for  using  the  data. 
The  availability  of  downloading  MicroCase's  Explorit 
software  and  using  their  on-line  analysis  tool  has  greatly 
reduced  the  barrier  of  data  analysis  for  a  larger  audience. 
And,  providing  data  files  that  offer  complete  question 
wording, detailed metadata, verified data, and multiple
download  formats,  renders  a  rich  resource  to  the 
experienced  and  inexperienced  user  alike.  Reducing  each 
of  these  barriers,  and  extending  the  services  offered,  has 
helped  to  increase  the  use  of  the  data  and  expand  the 
diversity  of  the  audience  using  the  ARDA. 

Finally, we want to close with a gentle reminder to
ourselves and others. Democratizing access to data and
metadata are noble goals made possible by recent advances
in technology. Yet, we should remember that metadata are
often an empty promise unless the data are available; and
easily accessible data can still be useless (and misleading)
unless they come from carefully conducted and prepared data
collections. A data archive will still be judged by the
quality  of  the  data  it  provides.  Hence,  just  as  evangelists 
close  each  revival  with  an  invitation  to  submit  to  the 
message  just  heard,  we  end  each  essay  and  presentation 
with  an  invitation  for  submitting  data.    If  you  have  data  on 
American  religion  to  submit,  or  you  know  of  data  that 
should  be  submitted,  contact  the  ARDA 
(archive@sri.soc.purdue.edu) or download a submission
form from our web site (www.TheARDA.com).

¹ Due to the character limitations of SPSS for variable
descriptions,  some  of  the  questions  will  be  truncated  when 
SPSS  portable  files  are  downloaded. 

² The simplified version of the Explorit software,
downloaded from the ARDA web site, provides univariate
statistics with the appropriate bar graphs and pie charts,
crosstabs with the appropriate statistics, and a complete list
of survey questions that can be searched for a topic of interest.

³ If the variable has too many categories for constructing a
table,  the  user  receives  a  message  with  this  information. 

* Paper presented at the IASSIST Conference, May 17,
1999, Ryerson Polytechnic University, Toronto, Ontario.
Roger Finke, Jennifer McKinney and Matt Bahr, The
American Religion Data Archive, Department of
Sociology, 1365 Stone Hall, Purdue University, West
Lafayette,  IN  47907-1365,  765-494-0081, 
archive@sri.soc.purdue.edu,  www.TheARDA.com 
Climate Indices for Use in
Social and Behavioral Research


by William H. Walters


Overview

While climate influences many social and behavioral phenomena, it is
often poorly or incompletely represented in social science research.
Studies of elderly migration, for example, often rely on a single
variable to represent the full set of climatic conditions found across
the United States (Walters 1994b). Moreover, there is no reliable guide
to the selection of the most appropriate climate variables. Any single
construct such as winter temperature can be represented by a variety of
indicators: minimum daily temperature, average daily temperature, number
of freezing days, number of below-zero days, number of heating
degree-days, etc. Although observed variables are essential in
climatological research, statistically constructed indices may be more
useful for many social and behavioral applications.

This report describes the use of factor analysis to create
five climate indices from a set of 37 original (observed)
variables. These indices represent all the major
components of near-surface climate variation within the
United States. In addition, they offer at least three
advantages over the original variables:

1) While any individual observed variable may be
affected by measurement error, each index incorporates
the variance common to more than one of the original
variables. For instance, the difficulty of obtaining
accurate snowfall measurements will produce more error
in the observed variable (snowfall depth) than in an
index that incorporates both snowfall and a number of
related measures.

2) The five indices are uncorrelated and represent
nearly 90 percent of the variance within the original set
of 37 variables. There is no need to select a subset of
the variables for use in multivariate studies since all five
can be used together without danger of multicollinearity.

3) The data set is readily accessible to scholars whose
primary interests lie outside climatology. (Appendix A
presents the complete set of indices for almost every
first-order weather station within the coterminous
United States.) In contrast, many of the data files
distributed by NOAA require expertise in the use of
complex and sometimes discipline-specific data formats.¹

Along with the climate indices (factor scores), factor analysis
produces a set of factor loadings that reveal the relationships among
the original variables. The results of this analysis confirm that
American climates are dominated by strong seasonal influences. In
particular, summer air moisture and temperature are not closely linked
to the corresponding winter conditions.

Previous  Research 

Factor  analysis,  developed  for  use  in  psychometric 
research, has since achieved widespread application in the
field  of  climatology  —  most  often  in  the  construction  of 
climate  classification  schemes.  R-mode  factor  analysis,  a 
variant  of  the  usual  technique,  can  be  used  to  reveal  the 
relationships among a set of observed climate variables and
to represent those variables through a smaller number of
factors.² The resulting indices (factor scores) are useful
whenever  it  is  necessary  to  represent  the  full  range  of 
climate  variation  through  a  limited  number  of  variables,  or 
whenever  the  underlying  components  of  climate  are  more 
important  than  the  observed  values  themselves.  As  a 
predictor  of  retirement  migration,  for  example,  an  index  of 
winter  climate  severity  is  probably  more  meaningful  than 
the  number  of  snow  days  or  the  average  January 
temperature  (Walters  1994a). 

Richman  (1986)  reviews  the  use  of  factor  analysis  in 
climate  research.  He  describes  six  modes  of  analysis, 
which can be used to (1) classify geographic locations
according  to  climate,  (2)  identify  time  periods  in  which 
climatic  conditions  remained  stable,  and  (3)  represent  a 
large number of climate variables through a smaller number
of  factors.  While  many  authors  have  focused  on  the  first 
two  goals,  only  a  few  have  conducted  the  R-mode  analyses 
that  meet  the  third  objective.  Micklin  and  Dickason 
(1981), for example, found that 16 climate indicators for
the  Soviet  Union  could  be  adequately  represented  by  just 
four  factors.  These  factors  —  aridity,  continentality, 
atmospheric  turbidity,  and  thermality  —  captured  85%  of 
the  variance  within  the  original  set  of  variables.  Similar 
analyses  have  been  undertaken  for  Australia 
(Puvaneswaran  1990),  Canada  (Powell  1977),  Greece 
(Bartzokas and Metaxas 1995), Nigeria (Olaniran 1986),
and Pakistan (Oliver et al. 1978). Using data for the state
of Maine, Briggs and Lemin (1992) found that 37 climate
indicators  could  be  represented  by  just  three  constructed 
indices.  The  climate  of  Midland,  Texas,  is  apparently  more 
complex,  involving  up  to  ten  distinct  factors  (Ladd  and 
Driscoll  1980). 

Only  two  studies  have  presented  R-mode  factor  analysis 
results  for  the  entire  United  States.  Davis  and  Kalkstein 
(1990)  focus  on  weather  rather  than  climate,  however, 
while Walters (1994a) uses pre-1970 data and evaluates
only those sites near metropolitan areas. The R-mode
analysis  presented  here  is  based  upon  more  recent  data  and 
represents  the  full  range  of  climate  variation  within  the 
coterminous  United  States. 

Data  And  Methods 

Data for 216 first-order weather stations were taken from
the  Local  Climatological  Data  series  of  the  National 
Oceanic  and  Atmospheric  Administration  (Wood  1996). 
Eighteen  stations  were  excluded  due  to  insufficient  data. 
The  temperature  and  precipitation  data  are  site-adjusted 
averages,  1961  to  1990.  All  other  variables  are  based  on 
measurements  made  prior  to  1994.  The  length  of  record 
varies  by  site  and  phenomenon  but  is  typically  30  to  50 
years. 

Principal components analysis (PCA) with varimax
rotation³ was applied to the 37 variables shown in Table 1.
These variables include all the meaningful components of
climate: annual, summer, and winter values of temperature,
precipitation,  humidity,  cloud  cover,  wind  speed,  storm 
days,  fog  days  and  precipitation  days;  as  well  as  related 
indicators  such  as  snowfall,  wind  chill,  and  heat  stress. 

PCA, like other types of factor analysis, is an objective,
empirical  procedure  that  reapportions  the  variance  within 
the  original  set  of  variables.  The  results  reflect  the  pattern 
of  correlations  among  these  variables  so  that  each  factor 
usually  represents  a  cluster  of  related  measures.  In  this 
instance,  87.8%  of  the  total  variance  can  be  represented  by 
just  five  factors  (five  indices).  These  factors  were  rotated 
and  interpreted  according  to  the  criteria  suggested  by 
Cattell (1958), Rummel (1970) and Thurstone (1947).
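
For readers unfamiliar with the procedure, the following Python sketch (using NumPy) shows one common way to carry out PCA followed by varimax rotation. It is an illustration under stated assumptions, not the software used for this study: the data are synthetic, the function names `varimax` and `pca_varimax` are hypothetical, and the rotation uses the widely circulated SVD-based iteration for the varimax criterion rather than any routine named in the paper.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-8):
    """Rotate a (variables x factors) loading matrix toward the varimax criterion.

    Returns the rotated loadings and the orthogonal rotation matrix.
    """
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient-like target of the varimax criterion, re-orthogonalized by SVD.
        target = rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        new_criterion = s.sum()
        if criterion != 0.0 and new_criterion < criterion * (1 + tol):
            break
        criterion = new_criterion
    return loadings @ rotation, rotation


def pca_varimax(data, n_factors):
    """PCA on the correlation matrix, then varimax rotation of the retained factors."""
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)  # standardize
    corr = (z.T @ z) / (len(z) - 1)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:n_factors]  # largest eigenvalues first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs * np.sqrt(eigvals)
    scores = (z @ eigvecs) / np.sqrt(eigvals)  # unit-variance component scores
    rotated_loadings, rotation = varimax(loadings)
    rotated_scores = scores @ rotation  # orthogonal rotation keeps scores uncorrelated
    explained = eigvals.sum() / data.shape[1]  # share of total variance retained
    return rotated_loadings, rotated_scores, explained


# Synthetic stand-in for station data: 200 "stations", 6 variables, 2 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = np.array([[1.0, 0.9, 0.8, 0.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0, 1.0, 0.9, 0.8]])
data = latent @ mixing + 0.3 * rng.normal(size=(200, 6))

loadings, scores, explained = pca_varimax(data, n_factors=2)
print(np.round(loadings, 2))
print("share of variance retained:", round(float(explained), 3))
```

In this sketch the rotated scores play the role of the climate indices and the rotated loadings play the role of the entries in Table 1; the retained share of variance corresponds to the 87.8% figure reported for the five factors.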

Results 

Varimax rotation always produces independent
(uncorrelated) factors. In this case, each factor is
conceptually distinct as well. That is, each has a unique
and readily identifiable meaning. (See Table 1.)

The first factor, F1, represents winter temperature and
snowfall. Locations with high values of F1 tend to have
mild winters, relatively few freezing days, little snowfall,
and only modest seasonal temperature variation. In
contrast, sites with low values of F1 can expect severe
winter temperatures and heavy snowfall. To a lesser
extent, F1 represents annual and summer temperatures.
(High values of F1 correspond to high temperatures
throughout the year.) Factor 1 is not a straightforward
indicator of summer temperature, however, since (1)
another factor, F4, represents maximum daily temperature
throughout the summer months and (2) the summer
temperature variables most closely associated with F1 are
strongly related to F4 as well. While winter temperature is
fully represented by F1, summer temperature fails to
emerge as a single, independent component of the climate
system.

The  second  factor,  F2,  is  a  summer  air-moisture  indicator 
representing  summer  precipitation,  cloud  cover,  humidity, 
and  storms.  While  summer  temperature  and  humidity  are 
often  thought  to  occur  in  tandem,  these  results  show  that 
the  two  phenomena  are  not  necessarily  related.  In 
particular,  only  one  of  the  variables  most  closely  associated 
with  F2  (heat  stress  —  humiture)  is  strongly  related  to  both 
F1 and F2.

The third factor, F3, is much like F2 but represents winter
rather  than  summer  conditions.  Locations  with  high  values 
of  F3  tend  to  have  many  rainy  days,  heavy  cloud  cover  and 
high  humidity  throughout  the  cooler  months.  In  contrast, 
places  with  low  values  of  F3  are  distinguished  by  relatively 
clear,  dry  winters.  While  the  annual  air  moisture  variables 
have  high  loadings  on  both  F2  and  F3,  Factor  3  is  the  best 
single  indicator  of  year-round  precipitation,  cloud  cover, 
and  humidity. 

The  fourth  factor,  F4,  represents  those  aspects  of  summer 
temperature not included in Factor 1. In particular, summer
maximum  daily  temperature  is  most  closely  related  to  F4. 
(High  values  of  F4  correspond  to  cool  summers.)  Table  1 
shows  that  the  other  summer  temperature  variables  are  also 
closely  linked  to  F4  even  though  their  primary  association 
is with F1.

The fifth factor, F5, is primarily a wind-speed indicator. It
incorporates  all  three  wind-speed  variables  (annual, 
summer,  and  winter)  as  well  as  the  number  of  days  with 
dense  fog. 

Taken  together,  the  factor  loadings  confirm  that  American 
climates are dominated by strong seasonal influences.
Rather  than  forming  a  single  precipitation  factor,  for 
instance,  the  various  precipitation  variables  combine  with 
other  air-moisture  indicators  (cloud  cover  and  humidity)  to 
create  two  distinct  seasonal  factors,  F2  and  F3.  Likewise, 
summer  temperature  is  at  least  partly  independent  of  winter 
temperature.  Of  the  several  components  of  climate,  only 
wind  speed  and  fog  (Factor  5)  fail  to  display  strong 
seasonal  independence. 
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The  climate  indices  (factor  scores)  for  each  weather  station 
are  presented  in  Appendix  A.""  By  mapping  the  highest  and 
lowest  scores,  we  can  identify  the  spatial  pattern  associated 
with  each  factor.  Figure  1  reveals  that  each  factor  is 
spatially  coherent  —  nearby  locations  have  similar  values 

—  and  that  each  has  a  distinctive  geographical  pattern. 
Winter/annual  temperature  and  snowfall  (Fl )  vary  with 
latitude,  for  instance,  while  summer  air  moisture  (F2)  is 
highest  in  the  Southeast  and  lowest  in  the  West.  Figure  1 
also  helps  illustrate  why  the  summer  temperature  variables 
are  associated  with  both  Fl  and  F4.  Factor  1  shows  the 
influence  of  latitude,  primarily,  while  F4  best  represents  the 
distinction  between  continental  and  marine  climates. 
Summer  temperature  is  therefore  a  function  of  both  latitude 
and  continentality.  In  contrast,  winter  temperature  and 
snowfall  can  be  adequately  represented  by  a  single  factor 
(F1) that varies chiefly by latitude.
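
The highest and lowest scores on a factor can also be picked out without a map, simply by ranking the stations. The following is a minimal sketch in Python using a handful of F4 scores copied from Appendix A; the dictionary and the helper function `extremes` are illustrative names, not part of the original analysis.

```python
# F4 scores for a few stations, copied from Appendix A.
f4_scores = {
    "Santa Maria, CA": 3.96,
    "Quillayute, WA": 3.43,
    "Los Angeles (Airport), CA": 3.03,
    "Washington (National), DC": -0.02,
    "Fresno, CA": -1.58,
    "Salt Lake City, UT": -1.68,
    "Waco, TX": -1.73,
}

def extremes(scores, n=3):
    """Return the n highest- and n lowest-scoring stations (hypothetical helper)."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n], ranked[-n:]

highest, lowest = extremes(f4_scores)
print("Highest F4 (cool summers):", highest)
print("Lowest F4 (warm summers):", lowest)
```

In this small sample the highest F4 scores fall on the Pacific coast and the lowest in the continental interior, which is consistent with the marine versus continental contrast described above.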

Conclusions 

The  American  climate  system  can  be  represented  by  just 
five  indices  —  five  sets  of  factor  scores.  Because  these 
scores  are  uncorrelated,  all  five  can  be  used  together  —  as 
explanatory  variables,  for  instance  —  without  danger  of 
multicollinearity.
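
The claim that the scores are uncorrelated can be checked numerically. The sketch below uses synthetic data, not the climate data set, and assumes standardized principal component scores followed by an orthogonal rotation; an arbitrary 30-degree rotation stands in for varimax here, since any orthogonal rotation has the same effect on the correlations among the scores.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the climate variables: 200 "stations", 6 correlated variables.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 6))
X = latent @ mixing + 0.3 * rng.normal(size=(200, 6))

# Standardize, then extract principal components from the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
R = np.corrcoef(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1][:2]                      # two largest components
scores = Z @ eigvecs[:, order] / np.sqrt(eigvals[order])   # unit-variance scores

# Orthogonal rotation of the standardized scores (illustrative angle, not varimax).
theta = np.deg2rad(30)
T = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
rotated = scores @ T

# The correlation matrix of the rotated scores is the identity (up to rounding).
score_corr = np.corrcoef(rotated, rowvar=False)
print(np.round(score_corr, 6))
```

Because the off-diagonal correlations are zero, entering all of the scores in one regression does not inflate the standard errors the way correlated raw climate variables would.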

The results of this analysis are consistent with previous
research  on  the  factor  structure  of  American  climates.  In 
particular,  five  of  the  six  factors  identified  in  an  earlier 
study  (Walters  1994a)  can  be  seen  here  as  well.  This 
suggests  that  the  factor  structure  has  not  changed  over  time 
and that it does not vary when new locations are added to
the  analysis.  The  relationships  observed  here  are  not 
necessarily valid for other countries or for particular
regions  of  the  U.S.,  however.  The  climate  of  Queensland, 
Australia,  for  example,  does  not  display  strong  seasonality 
(Puvaneswaran  1990).  Likewise,  the  climates  of  Nigeria 
(Olaniran 1986), Pakistan (Oliver et al. 1978) and Maine
(Briggs  and  Lemin  1992)  are  dominated  by  regional  and 
local  factors  not  present  in  the  United  States  at  the  national 
level. 

Notes 

1. See, for example, the First Order Summary of the Day
(http://www.ncdc.noaa.gov/onlineprod/tfsod/climvis/ 
ftppage.html). 

2.  Richman  (1986)  provides  a  good  overview  of  this 
technique. 

3. Several oblique and orthogonal rotation methods were
evaluated empirically. While each method generated a
similar set of factors, varimax gave the most robust results,
that is, the results that changed the least when random variation
(representing error) was added to the original climate
variables.

4.  A  machine-readable  version  of  Appendix  A  is  available 
from  the  author. 
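
The robustness check described in note 3 can be sketched as follows. This is an illustration under stated assumptions, not the author's procedure: the data are synthetic, the noise level is arbitrary, `varimax` is the standard SVD-based iteration, and `stability` (mean best-match absolute congruence between two loading matrices) is a hypothetical summary measure chosen for this example.

```python
import numpy as np

def pca_loadings(X, k):
    """Unrotated loadings of the first k principal components of the correlation matrix."""
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1][:k]
    return eigvecs[:, order] * np.sqrt(eigvals[order])

def varimax(L, max_iter=200, tol=1e-10):
    """Orthogonal varimax rotation of a loading matrix (standard SVD-based iteration)."""
    p, k = L.shape
    T = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        Lr = L @ T
        grad = L.T @ (Lr**3 - Lr @ np.diag((Lr**2).sum(axis=0)) / p)
        u, s, vt = np.linalg.svd(grad)
        T = u @ vt
        new_criterion = s.sum()
        if new_criterion <= criterion * (1 + tol):
            break
        criterion = new_criterion
    return L @ T

def stability(A, B):
    """Mean best-match absolute congruence between the columns of A and B."""
    A = A / np.linalg.norm(A, axis=0)
    B = B / np.linalg.norm(B, axis=0)
    return np.abs(A.T @ B).max(axis=1).mean()

rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 2))
# Hypothetical simple structure: variables 0-2 load on one factor, 3-5 on the other.
pattern = np.array([[1, 1, 1, 0, 0, 0],
                    [0, 0, 0, 1, 1, 1]], dtype=float)
X = latent @ pattern + 0.4 * rng.normal(size=(200, 6))

# Rotated loadings from the original data and from a copy with random error added.
base = varimax(pca_loadings(X, 2))
noisy = varimax(pca_loadings(X + 0.2 * rng.normal(size=X.shape), 2))
print("stability of varimax loadings:", round(stability(base, noisy), 3))
```

A value near 1 means the rotated loadings barely moved when error was added. Repeating the comparison with other rotation methods would give the kind of ranking that note 3 reports.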

Table 1. Rotated Factor Loadings

Variable                               F1     F2     F3     F4     F5     h²

freezing days (annual)              -0.97      —      —      —      —   0.95
min daily temp (winter)              0.97      —      —      —      —   0.95
avg daily temp (winter)              0.97      —      —      —      —   0.96
heating degree days (annual)        -0.96      —      —      —      —   0.98
zero-degree days (annual) *         -0.95      —      —      —      —   0.92
avg daily temp (annual)              0.94      —      —      —      —   0.99
snow days (annual) *                -0.94      —      —      —      —   0.92
snowfall (annual) *                 -0.94      —      —      —      —   0.92
wind chill (winter)                  0.93      —      —      —      —   0.96
seasonal temp variation             -0.80      —      —  -0.41      —   0.85
cooling degree days (annual)         0.79      —      —  -0.40      —   0.92
storm days (winter) *                0.72   0.44      —      —      —   0.75
avg daily temp (summer)              0.70      —  -0.30  -0.51      —   0.92
heat stress — THI (summer)           0.68   0.58      —  -0.33      —   0.92
ninety-degree days (annual)          0.66      —  -0.33  -0.53      —   0.84
precipitation (summer)                  —   0.91      —      —      —   0.91
precipitation days (summer)             —   0.88      —      —      —   0.88
storm days (annual)                     —   0.84      —  -0.30      —   0.87
storm days (summer)                     —   0.81      —      —      —   0.78
cloud cover (summer)                    —   0.75   0.37   0.38      —   0.87
heat stress — humiture (summer)      0.48   0.74      —      —      —   0.86
humidity (summer)                       —   0.67   0.45   0.41      —   0.86
precipitation (annual)               0.32   0.63   0.47   0.34      —   0.86
humidity (winter)                       —      —   0.89      —      —   0.80
cloud cover (winter)                -0.37      —   0.88      —      —   0.92
precipitation days (winter)             —      —   0.81   0.41      —   0.86
cloud cover (annual)                -0.44   0.35   0.74      —      —   0.91
humidity (annual)                       —   0.49   0.71   0.30      —   0.85
precipitation days (annual)         -0.31   0.44   0.69   0.36      —   0.89
precipitation (winter)               0.42      —   0.53   0.51      —   0.74
fog days (summer) *                     —   0.42      —   0.70      —   0.79
max daily temp (summer)              0.55      —  -0.39  -0.65      —   0.91
wind speed (annual)                     —      —      —      —   0.92   0.94
wind speed (summer)                     —      —      —      —   0.88   0.89
wind speed (winter)                     —      —      —      —   0.87   0.92
fog days (winter)                       —      —   0.37      —   0.61   0.68
fog days (annual)                       —      —      —   0.58   0.58   0.75

% variance explained                 39.3   25.5   11.4    8.2    3.5
cumulative %                         39.3   64.8   76.2   84.3   87.8

Note: Principal components analysis with varimax rotation. Annual = average for all months. Summer = average for June, July, and August.
Winter = average for December, January, and February. Values in bold type are the highest loadings for each variable. Loadings between
-0.30 and 0.30 are not shown. Variables marked with an asterisk (*) were entered in cube root form to maintain linearity. Communality
(h²) indicates the proportion of the variance within each variable that is shared with the other variables in the set.
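
The summary rows and the communality column of Table 1 follow from the loadings themselves. As a minimal sketch, assuming the usual definitions for an orthogonal solution (communality as the row sum of squared loadings, variance explained as the column sum of squared loadings divided by the number of variables), the calculation looks like this. The loading matrix is hypothetical and is not taken from Table 1.

```python
import numpy as np

# Hypothetical rotated loadings for four variables on two factors
# (illustrative values, not taken from Table 1).
loadings = np.array([
    [0.90, 0.10],
    [0.85, -0.20],
    [0.15, 0.80],
    [-0.30, 0.75],
])

# Communality: share of each variable's variance reproduced by the retained factors.
communality = (loadings ** 2).sum(axis=1)

# Percent of total variance explained by each factor. The variables are standardized,
# so the total variance equals the number of variables.
pct_variance = 100 * (loadings ** 2).sum(axis=0) / loadings.shape[0]
cumulative = np.cumsum(pct_variance)

print("communality:", np.round(communality, 2))
print("% variance explained:", np.round(pct_variance, 1))
print("cumulative %:", np.round(cumulative, 1))
```

Because Table 1 omits loadings between -0.30 and 0.30, its communalities cannot be reproduced exactly from the printed values. For wind speed (annual), for example, the single printed loading of 0.92 accounts for about 0.85 of the reported 0.94.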

Appendix A. Climate indices (factor scores, regression method) for 216 first-order weather stations in the coterminous United States.
Sixteen stations were excluded due to insufficient data. Each factor has a mean of 0.00 and a standard deviation of 1.00.

Weather Station             State     F1     F2     F3     F4     F5

Birmingham                  AL      0.74   0.88   0.28  -0.07  -0.95
Huntsville                  AL      0.66   0.78   0.38   0.06  -0.50
Mobile                      AL      1.36   1.67   0.22  -0.14   0.30
Montgomery                  AL      1.18   0.81   0.22   0.02  -0.74
Fort Smith                  AR      0.51   0.43  -0.02  -0.66  -0.52
Little Rock                 AR      0.79   1.02   0.23  -0.49  -0.28
Flagstaff                   AZ     -1.10  -0.32  -1.93   1.17  -1.32
Phoenix                     AZ      1.49  -1.43  -2.25  -1.17  -0.56
Tucson                      AZ      0.95  -0.56  -2.57  -0.60  -0.20
Winslow                     AZ     -0.29  -0.94  -1.85  -0.67  -0.20
Yuma                        AZ      1.69  -1.83  -2.78  -0.90   0.18
Bakersfield                 CA      1.49  -2.86   0.01  -1.36   0.03
Fresno                      CA      1.58  -2.92   0.87  -1.58   0.46
Long Beach                  CA      1.65  -1.97  -1.25   2.39  -0.53
Los Angeles (Airport)       CA      1.63  -1.90  -1.36   3.03  -0.31
Los Angeles (Civic Center)  CA      1.70  -1.79  -1.76   2.41  -0.99
Redding                     CA      1.16  -2.17   0.23  -0.82  -0.18
Sacramento                  CA      1.55  -2.77   0.95  -0.85   0.69
San Diego                   CA      1.59  -1.85  -1.23   2.66  -0.67
San Francisco (Airport)     CA      1.33  -2.46   0.07   1.68   0.72
Santa Maria                 CA      1.33  -2.14  -1.46   3.96  -0.22
Stockton                    CA      1.55  -2.98   1.03  -1.14   0.75
Alamosa                     CO     -1.59  -0.43  -1.54   0.37  -0.56
Colorado Springs            CO     -1.07   0.40  -2.86   1.49  -0.19
Denver                      CO     -0.92  -0.14  -1.87   0.46  -0.55
Grand Junction              CO     -0.48  -1.22  -0.72  -1.34  -0.30
Pueblo                      CO     -0.74  -0.23  -2.32  -0.03  -0.04
Bridgeport                  CT     -0.13   0.06  -0.38   1.19   0.74
Hartford                    CT     -0.53   0.14  -0.15   0.97  -0.46
Washington (Dulles)         DC     -0.11   0.29  -0.02   0.70  -0.72
Washington (National)       DC      0.20   0.19  -0.18  -0.02  -0.15
Wilmington                  DE      0.02   0.17  -0.06   0.70  -0.06
Daytona Beach               FL      1.56   1.58  -0.09   0.01   0.06
Fort Myers                  FL      1.71   2.44  -0.60  -0.57   0.01
Jacksonville                FL      1.40   1.43   0.06  -0.03   0.01
Key West                    FL      2.01   1.18  -0.32  -0.55   0.61
Miami                       FL      1.77   1.91  -0.45  -0.12  -0.02
Orlando                     FL      1.61   1.98  -0.30  -0.40   0.18
Pensacola                   FL      1.46   1.38   0.26  -0.15   0.24
Tallahassee                 FL      1.44   1.91   0.06   0.18  -0.39
Tampa                       FL      1.62   1.85  -0.32  -0.46   0.03
West Palm Beach             FL      1.72   1.87  -0.17  -0.29   0.06
Athens                      GA      0.86   0.67  -0.18   0.71  -0.37
Atlanta                     GA      0.80   0.65  -0.13   0.59   0.06
Augusta                     GA      0.91   0.85  -0.11   0.24  -0.79
Columbus                    GA      1.11   0.86   0.26  -0.19  -0.76
Macon                       GA      1.05   0.80   0.04   0.00  -0.41
Savannah                    GA      1.16   1.35  -0.28   0.30  -0.12
Des Moines                  IA     -0.75   0.59  -0.09  -0.70   0.38
Sioux City                  IA     -0.89   0.41  -0.15  -0.75   0.46
Waterloo                    IA     -1.02   0.51   0.09  -0.53   0.33
Boise                       ID     -0.21  -2.00   0.71  -1.04   0.07
Pocatello                   ID     -0.84  -1.56   0.74  -1.39   0.35
Chicago                     IL     -0.72   0.29   0.44  -0.41   0.08
Moline                      IL     -0.70   0.64   0.00  -0.39  -0.02
Peoria                      IL     -0.52   0.48   0.41  -0.56   0.14
Rockford                    IL      0.83   0.47   0.27  -0.31   0.06
Springfield                 IL     -0.37   0.45   0.37  -0.75   0.45
Evansville                  IN      0.05   0.37   0.52  -0.54  -0.56
Fort Wayne                  IN      0.58   0.25   0.86  -0.48   0.01
Indianapolis                IN      0.33   0.42   0.75  -0.39  -0.06
South Bend                  IN      0.67   0.31   1.16  -0.42   0.06
Concordia                   KS      0.49   0.63  -0.33  -1.07   0.42
Dodge City                  KS     -0.21   0.26  -1.67  -0.29   0.74
Topeka                      KS      0.42   0.86   0.07  -0.87  -0.06
Wichita                     KS      0.07   0.13  -0.19  -1.33   0.23
Jackson                     KY      0.10   0.80   0.32   1.31  -0.66
Lexington                   KY      0.05   0.54   0.54  -0.03  -0.24
Louisville                  KY      0.02   0.47   0.42  -0.25  -0.65
Paducah                     KY      0.28   0.69   0.42  -0.33  -0.44
Baton Rouge                 LA      1.42   1.36   0.36  -0.18  -0.22
Lake Charles                LA      1.59   1.09   0.83  -0.58   0.47
New Orleans                 LA      1.56   1.27   0.71  -0.45   0.01
Shreveport                  LA      1.17   0.41   0.46  -0.78  -0.09
Boston                      MA      0.26   0.02  -0.50   1.21   0.80
Worcester                   MA      0.56   0.04  -0.48   2.21   0.61
Baltimore                   MD      0.07   0.11  -0.27   0.55  -0.06
Caribou                     ME      1.79   0.39   0.35   0.69   0.23
Portland                    ME      0.83   0.12  -0.39   1.94  -0.37
Alpena                      MI      1.30   0.06   0.83   0.26  -0.81
Detroit                     MI      0.66   0.04   0.85  -0.32   0.14
Flint                       MI      0.85   0.05   0.83  -0.16  -0.08
Grand Rapids                MI      0.83   0.09   1.32  -0.40  -0.07
Houghton Lake               MI      1.24  -0.05   1.07   0.03  -0.46
Lansing                     MI      0.88   0.10   1.15  -0.41  -0.10
Muskegon                    MI      0.82  -0.11   1.34  -0.25   0.07
Sault Ste. Marie            MI      1.49   0.09   1.17   0.68  -0.38
Duluth                      MN      1.69   0.48  -0.21   0.90   0.42
International Falls         MN      2.07   0.47  -0.04   0.09  -0.64
Minneapolis-St. Paul        MN      1.28   0.38  -0.13  -0.58   0.12
Rochester                   MN      1.26   0.47   0.23  -0.47   1.27
St. Cloud                   MN      1.53   0.28  -0.21   0.03  -0.68
Columbia                    MO      0.21   0.52   0.13  -0.43   0.19
Kansas City                 MO      0.29   0.61  -0.35  -0.51   0.54
Springfield                 MO      0.01   0.63  -0.08  -0.44   0.44
St. Louis                   MO      0.05   0.40   0.38  -0.90   0.02
Jackson                     MS      1.11   0.92   0.61  -0.52  -0.48
Meridian                    MS      1.12   0.78   0.39   0.03  -0.93
Tupelo                      MS      0.83   0.58   0.37  -0.16  -0.74
Billings                    MT      1.11  -0.64  -0.89  -0.11   0.39
Glasgow                     MT      1.45  -0.61   0.01  -1.12   0.41
Great Falls                 MT      1.29  -0.54  -0.64  -0.13   0.76
Helena                      MT      1.28  -0.66  -0.23  -0.47  -0.85
Kalispell                   MT      1.14  -1.12   1.31   0.02  -0.95
Missoula                    MT      0.95  -1.22   1.29  -0.49  -0.96
Asheville                   NC      0.20   0.86  -0.52   2.18  -0.42
Cape Hatteras               NC      0.98   0.69   0.40   0.37   0.57
Charlotte                   NC      0.58   0.46  -0.31   0.63  -0.59
Greensboro                  NC      0.37   0.62  -0.36   0.89  -0.55
Raleigh                     NC      0.49   0.64  -0.41   0.91  -0.51
Wilmington                  NC      0.90   1.09  -0.13   0.49  -0.17
Bismarck                    ND     -1.62  -0.03  -0.27  -0.71  -0.01
Fargo                       ND     -1.66   0.14  -0.09  -0.83   0.70
Williston                   ND     -1.61  -0.31  -0.08  -1.00   0.02
Grand Island                NE     -0.86   0.45  -0.03  -1.06   0.43
Lincoln                     NE     -0.77   0.46   0.19  -1.35   0.17
Norfolk                     NE     -0.95   0.45  -0.63  -0.69   0.58
North Platte                NE     -1.02   0.24  -0.92  -0.36   0.13
Omaha (Eppley)              NE     -0.71   0.55  -0.35  -0.69   0.29
Omaha (North)               NE     -0.73   0.55  -0.45  -0.41  -0.17
Scottsbluff                 NE     -1.07  -0.01  -1.22  -0.36   0.11
Valentine                   NE     -1.22   0.19  -1.06  -0.72  -0.18
Concord                     NH     -1.02   0.13  -0.40   1.55  -1.01
Mount Washington            NH     -1.10   0.34   1.00   4.13  11.90
Atlantic City (NAFEC)       NJ     -0.01   0.21  -0.18   1.10   0.20
Newark                      NJ     -0.05   0.17  -0.16   0.34   0.13
Albuquerque                 NM     -0.18  -0.63  -2.34  -0.04  -0.22
Roswell                     NM      0.22  -0.46  -2.18  -0.05  -0.05
Elko                        NV     -0.87  -1.61  -0.24  -0.64  -1.28
Ely                         NV     -1.26  -1.20  -1.11  -0.59   0.06
Las Vegas                   NV      0.86  -1.90  -2.77  -0.95   0.31
Reno                        NV     -0.44  -2.10  -0.97  -0.20  -0.93
Winnemucca                  NV     -0.67  -1.92  -0.46  -0.89  -0.51
Albany                      NY     -0.92   0.27   0.34   0.34  -0.52
Binghamton                  NY     -0.89   0.15   1.01   0.77   0.17
Buffalo                     NY     -0.81   0.05   1.56  -0.34   0.47
New York (Central Park)     NY     -0.02  -0.02  -0.30   0.35  -0.38
New York (JFK)              NY      0.04   0.10  -0.46   1.17   0.75
New York (La Guardia)       NY     -0.02   0.07  -0.51   0.70   0.69
Rochester                   NY     -0.88  -0.06   1.29  -0.17  -0.38
Syracuse                    NY     -0.99   0.15   1.41  -0.28  -0.52
Akron-Canton                OH     -0.61   0.25   1.00   0.03  -0.14
Cincinnati                  OH     -0.26   0.46   0.60  -0.06  -0.29
Cleveland                   OH     -0.67   0.14   1.23  -0.42  -0.03
Columbus                    OH     -0.46   0.39   0.72  -0.07  -0.67
Dayton                      OH     -0.40   0.26   0.74  -0.27   0.03
Mansfield                   OH     -0.58   0.23   0.95  -0.08   0.42
Toledo                      OH     -0.70   0.20   0.85  -0.27  -0.29
Youngstown                  OH     -0.72   0.22   1.25   0.09  -0.17
Oklahoma City               OK      0.42  -0.01  -0.37  -1.11   0.94
Tulsa                       OK      0.37   0.30   0.00  -1.25   0.30
Astoria                     OR      0.77  -1.09   2.33   2.61  -0.35
Eugene                      OR      0.87  -2.00   2.58   0.79   0.11
Medford                     OR      0.65  -2.52   1.85  -0.60  -0.49
Pendleton                   OR      0.00  -2.29   1.18  -0.82   0.26
Portland                    OR      0.57  -1.65   2.16   0.52  -0.34
Salem                       OR      0.56  -1.91   2.32   0.51  -0.49
Allentown                   PA     -0.34   0.28   0.02   0.59  -0.21
Erie                        PA     -0.72   0.18   1.50  -0.42   0.16
Middletown/Harrisburg       PA     -0.22   0.18  -0.02   0.49  -0.79
Philadelphia                PA     -0.02   0.17  -0.13   0.49  -0.04
Pittsburgh                  PA     -0.61   0.23   0.87   0.15  -0.54
Wilkes-Barre/Scranton       PA     -0.64   0.19   0.42   0.52  -0.64
Williamsport                PA     -0.55   0.49   0.26   0.86  -0.79
Providence                  RI     -0.33   0.08  -0.33   1.26   0.16
Charleston                  SC      1.05   1.35  -0.09   0.18  -0.03
Columbia                    SC      0.87   0.96  -0.20   0.29  -0.68
Greenville-Spartanburg      SC      0.66   0.58  -0.38   1.07  -0.62
Aberdeen                    SD     -1.43   0.10  -0.27  -0.82   0.46
Huron                       SD     -1.30   0.21  -0.31  -0.92   0.57
Rapid City                  SD     -1.16  -0.01  -1.14  -0.13   0.37
Sioux Falls                 SD     -1.17   0.33  -0.20  -0.82   0.55
Bristol                     TN      0.08   0.55   0.14   1.25  -1.42
Chattanooga                 TN      0.57   0.79   0.25   0.43  -1.09
Knoxville                   TN      0.34   0.59   0.30   0.58  -0.88
Memphis                     TN      0.79   0.48   0.38  -0.73  -0.14
Nashville                   TN      0.38   0.63   0.33  -0.19  -0.55
Abilene                     TX      0.73  -0.23  -1.01  -1.09   1.03
Amarillo                    TX      0.00   0.00  -2.08  -0.18   1.68
Austin                      TX      1.40  -0.21   0.06  -0.93   0.45
Brownsville                 TX      2.03  -0.52   0.93  -1.51   1.55
Corpus Christi              TX      1.82  -0.36   0.85  -1.38   1.69
Dallas-Fort Worth           TX      1.02  -0.15  -0.13  -1.41   0.74
Del Rio                     TX      1.29  -0.60  -0.63  -1.28   0.75
El Paso                     TX      0.44  -0.66  -2.64  -0.31  -0.19
Houston                     TX      1.41   0.67   0.76  -0.91  -0.03
Lubbock                     TX      0.25  -0.08  -1.77  -0.38   1.21
Midland-Odessa              TX      0.64  -0.54  -1.68  -0.61   0.94
Port Arthur                 TX      1.59   1.05   0.96  -0.96   0.75
San Angelo                  TX      0.80  -0.47  -1.02  -1.05   0.58
San Antonio                 TX      1.41  -0.30   0.20  -1.19   0.53
Victoria                    TX      1.63   0.29   0.81  -1.21   0.98
Waco                        TX      1.18  -0.21   0.20  -1.73   1.08
Wichita Falls               TX      0.67  -0.06  -0.62  -1.43   1.07
Salt Lake City              UT     -0.36  -1.30   0.42  -1.68   0.05
Norfolk                     VA      0.56   0.47  -0.18   0.50   0.34
Richmond                    VA      0.26   0.56  -0.10   0.51  -0.57
Roanoke                     VA      0.01   0.42  -0.58   0.95  -0.63
Burlington                  VT     -1.34   0.25   0.51   0.23  -0.60
Olympia                     WA      0.54  -1.67   2.56   1.89  -0.19
Quillayute                  WA      0.76  -0.73   2.68   3.43  -1.17
Seattle-Tacoma              WA      0.55  -1.53   1.78   1.51  -0.07
Spokane                     WA     -0.42  -1.99   1.53  -0.35   0.51
Yakima                      WA     -0.34  -2.24   0.82  -0.84  -0.52
Green Bay                   WI     -1.18   0.23   0.16   0.14  -0.05
La Crosse                   WI     -1.10   0.56  -0.07  -0.15  -0.48
Madison                     WI     -1.06   0.37   0.25  -0.14  -0.04
Milwaukee                   WI     -0.85   0.21   0.20   0.22   0.51
Beckley                     WV     -0.47   0.68   0.68   1.15  -0.38
Charleston                  WV     -0.01   0.73   0.21   1.71  -0.86
Elkins                      WV     -0.70   0.93   0.70   1.79  -1.32
Huntington                  WV     -0.07   0.62   0.36   1.24  -1.00
Casper                      WY     -1.38  -0.51  -1.04  -0.42   0.85
Cheyenne                    WY     -1.23   0.19  -2.21   1.07   0.77
Lander                      WY     -1.40  -0.89  -1.30  -0.11  -1.12
Sheridan                    WY     -1.40  -0.51  -0.51  -0.33  -0.90

CHARTING THE FUTURE FOR SOCIAL,
SPATIAL & GOVERNMENT DATA SERVICES


DATA  IN  THE  DIGITAL  LIBRARY: 
Charting  the  Future  for  Social,  Spatial  and  Government  Data 

June 7-10, 2000
Northwestern  University 

The Twenty-Sixth (26th) Annual Conference of the
International Association for Social Science
Information Services and Technology (IASSIST) will
be held on the campus of Northwestern University
in Evanston, Illinois on June 7-10, 2000. This
year's conference, Data in the Digital Library:
Charting the Future for Social, Spatial and
Government Data, emphasizes the strengthening
relationships between archives and libraries in
managing, preserving and providing access to
"digital collections".

IASSIST conferences bring together data
professionals,  data  producers,  and  data  analysts 
from  around  the  world  who  are  engaged  in  the 
creation,  acquisition,  processing,  maintenance, 
distribution,  preservation,  and  use  of  numeric 
social  science  data  for  research  and  instruction. 

http://www.spc.uchicago.edu/DATALIB/ia2000/

IASSIST


INTERNATIONAL  ASSOCIATION   FOR 
SOCIAL     SCIENCE      INFORMATION 
SERVICE  AND  TECHNOLOGY 
•  •  •  • 

ASSOCIATION  INTERNATIONALE  POUR 
LES  SERVICES  ET  TECHNIQUES 
D'INFORMATION  EN  SCIENCES 
SOCIALES 


Membership 
form 


The International Association for
Social Science Information Services
and Technology (IASSIST) is an
international association of individuals
who are engaged in the acquisition,
processing, maintenance, and distribution
of machine readable text and/or
numeric social science data. The
membership includes information
system specialists, data base librarians
or administrators, archivists, researchers,
programmers, and managers. Their
range of interests encompasses hard
copy as well as machine readable data.

Paid-up members enjoy voting rights
and receive the IASSIST QUARTERLY.
They also benefit from reduced fees for
attendance at regional and international
conferences sponsored by IASSIST.

Membership  fees  are: 

Regular Membership: $40.00
per  calendar  year. 
Student  Membership:  $20.00 
per  calendar  year. 

Institutional subscriptions to the
quarterly are available, but do not
confer voting rights or other membership
benefits.

Institutional Subscription:
$70.00  per  calendar  year 
(includes  one  volume  of  the 
Quarterly) 


I would like to become a member of
IASSIST. Please see my choice below:

Options for payment in Canadian Dollars and
by Major Credit Card are available. See the
following web site for details:
http://datalib.library.ualberta.ca/iassist/mbrship2.html

□  $40  (US)  Regular  Member 

□  $20  Student  Member 

□  $70  Subscription  (payment  must 
be  made  in  US$) 

□  List  me  in  the  membership 
directory 

□ Add me to the IASSIST listserv

Please make checks payable,
in US funds, to IASSIST and
mail to:

IASSIST
Assistant Treasurer
JoAnn Dionne
50360 Warren Road
Canton, MI 48187
USA

Name:

Job Title:

Organization:

Address:

City:

State/Province:

Postal Code:

Country:

Phone:

FAX:

E-mail: